Added support for quantizing TEGroupedMLP for megatron-lm #403
base: main
Signed-off-by: Jennifer Chen <[email protected]>
Signed-off-by: Jenny Chen <[email protected]>
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #403      +/-   ##
==========================================
- Coverage   73.79%   73.77%   -0.03%
==========================================
  Files         171      171
  Lines       17591    17603      +12
==========================================
+ Hits        12982    12986       +4
- Misses       4609     4617       +8
if parallel_state.expert_tensor_parallel_group is not None:
    quantizer.sync_amax_across_distributed_group(
        parallel_state.expert_tensor_parallel_group
    )
The tensor-parallel sync here is not handled correctly in all cases. See the comments before sync_quantizer_amax_across_tp for more details.
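For readers unfamiliar with the operation under review, here is a minimal sketch of what syncing amax across a group amounts to: every rank in the group ends up with the group-wide maximum. It is a pure-Python simulation, not the library's implementation. The function `sync_amax_across_group`, the rank-to-amax dictionary, and the example group are all hypothetical. Distributed code would typically use a MAX all-reduce over the process group, and that too is an assumption about the reviewed method. The sketch does not address which cases the reviewer considers mishandled.

```python
# Hypothetical, single-process simulation of an amax sync (an all-reduce with MAX).
# Nothing here is taken from the PR; names and values are illustrative only.

def sync_amax_across_group(amax_per_rank, group):
    """Return a copy in which every rank in `group` holds the group's max amax.

    amax_per_rank: dict mapping rank id -> locally calibrated amax (float)
    group: iterable of rank ids that form the group, or None for "no group"
    """
    synced = dict(amax_per_rank)
    if not group:
        # Mirrors the `is not None` guard in the reviewed snippet: nothing to sync.
        return synced
    group_max = max(amax_per_rank[rank] for rank in group)
    for rank in group:
        synced[rank] = group_max
    return synced


# Four simulated ranks; ranks 0 and 1 form one (hypothetical) expert TP group.
amax = {0: 0.5, 1: 2.0, 2: 1.25, 3: 0.75}
expert_tp_group = [0, 1]

synced = sync_amax_across_group(amax, expert_tp_group)
print(synced)  # {0: 2.0, 1: 2.0, 2: 1.25, 3: 0.75}
```

Ranks outside the group keep their local values, which is why the choice of group matters: syncing over the wrong group either leaves shards with inconsistent scales or merges statistics that should stay separate.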
Signed-off-by: Kinjal Patel <[email protected]>
963657d to 22bfe0e
What does this PR do?
Type of change: ?
Overview: ?
Usage
# Add a code snippet demonstrating how to use this
Testing
Before your PR is "Ready for review"
Additional Information